Negation and lexical morphology across languages: Insights from a trilingual translation corpus

Author: Cartoni Bruno
Lefer Marie-Aude
Publication date: 02/08/2017

This paper proposes an exploratory cross-linguistic bird's-eye view of negative lexical morphology by examining English, French and Italian negative derivational affixes. More specifically, it aims to uncover the French and Italian equivalents of the English affixes de-, dis-, in-, non-, un- and -less. These include morphological equivalents (i.e. negative prefixes in French and Italian) as well as non-morphological equivalents (i.e. single words devoid of negative affixation, multi-word units or paraphrases). The study relies on a nine-million-word trilingual translation corpus made up of texts from the Europarl corpus and shows that the systematic analysis of translation data makes it possible to identify the major morphological dissimilarities between the three languages investigated. The frequent use of non-morphological translations in French and Italian reflects fundamental differences between the source language (English) and the two target languages (French and Italian), hence pointing to possible translation difficulties. Morphological translations, on the other hand, bring to light cross-linguistic similarities in the use of negative affixes

RERO DOC Digital Library

A Task-based Evaluation of French Morphological Resources and Tools

Author: Bernhard Delphine
Cartoni Bruno
Tribout Delphine
Publication venue: Stanford Calif.: CSLI Publications
Publication date: 01/01/2011

Morphology is a key component for many Language Technology applications. However, morphological relations, especially those relying on the derivation and compounding processes, are often addressed in a superficial manner. In this article, we focus on assessing the relevance of deep and motivated morphological knowledge in Natural Language Processing applications. We first describe an annotation experiment whose goal is to evaluate the role of morphology for one task, namely Question Answering (QA). We then highlight the kind of linguistic knowledge that is necessary for this particular task and propose a qualitative analysis of morphological phenomena in order to identify the morphological processes that are most relevant. Based on this study, we perform an intrinsic evaluation of existing tools and resources for French morphology, in order to quantify their coverage. Our conclusions provide helpful insights for using and building appropriate morphological resources and tools that could have a significant impact on the application performance

Hal-Diderot

Annotating the meaning of discourse connectives by looking at their translation: The translation-spotting technique

Author: Cartoni Bruno
Meyer Thomas
Zufferey Sandrine
Publication venue: University of Illinois at Chicago Library
Publication date: 01/01/2013

The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to a reliable annotation of connectives, making use of the information provided by their translation in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation with an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid constitution of large annotated datasets

University of Illinois at Chicago: Journals@UIC

Infoscience - École polytechnique fédérale de Lausanne

Bern Open Repository and Information System (BORIS)

Dialogue & Discourse (E-Journal - Universität Bielefeld)

Word-formation in original and translated English: source language influence on the use of un- and -less

Author: Cartoni Bruno
Saint-Léger Marie-Paule de
Publication date: 01/01/2013

This article aims to assess whether the word-formation features of translated language, as opposed to original language, are source language (SL)-dependent or translation-related. To do so, we analyze the use of the -less and un- negative affixes in original English and in English translated from four SL: French, Italian, Dutch and German. Findings based on the Europarl corpus show that the use of -less and un- in translated English is partially SL-dependent

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

DIAL UCLouvain

DIAL USaint-Louis

Building 'directional corpora' for unbiased contrastive analysis

Author: Cartoni Bruno
Meyer Thomas
Publication date: 19/12/2013

Infoscience - École polytechnique fédérale de Lausanne

Extracting Directional and Comparable Corpora from a Multilingual Corpus for Translation Studies

Author: Cartoni Bruno
Meyer Thomas
Publication date: 19/12/2013

Translation studies rely more and more on corpus data to examine the specificities of translated texts, which can be translated from different original languages and compared to original texts. In parallel, more and more multilingual corpora are becoming available for various natural language processing tasks. This paper questions the use of these multilingual corpora in translation studies and shows the methodological steps needed to obtain more reliably comparable sub-corpora that consist of original and directly translated text only. Various experiments are presented that show the advantage of directional sub-corpora

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Annotating the meaning of discourse connectives by looking at their translation: The translation-spotting technique

Author: Cartoni Bruno
Meyer Thomas
Zufferey Sandrine
Publication venue: The Dialogue & Discourse Board of Editors
Publication date: 30/04/2013

Dialogue & Discourse (E-Journal - Universität Bielefeld)

Machine Translation Evaluation beyond the Sentence Level

Author: Brovelli (Meyer) Thomas
Cartoni Bruno
Libovický Jindřich
Publication venue: European Association for Machine Translation
Publication date: 01/01/2018

Automatic machine translation evaluation was crucial for the rapid development of machine translation systems over the last two decades. So far, most attention has been paid to evaluation metrics that work with text at the sentence level, as did the translation systems themselves. Across-sentence translation quality depends on discourse phenomena that may not manifest at all when staying within sentence boundaries (e.g. coreference, discourse connectives, verb tense sequence, etc.). To tackle this, we propose several document-level MT evaluation metrics: generalizations of sentence-level metrics, language-(pair)-independent versions of lexical cohesion scores and coreference and morphology preservation in the target texts. We measure their agreement with human judgment on a newly created dataset of pairwise paragraph comparisons for four language pairs

Repositorio Institucional de la Universidad de Alicante

A Corpus-based Contrastive Analysis for Defining Minimal Semantics of Inter-sentential Dependencies for Machine Translation

Author: Cartoni Bruno
Liyanapathirana Jeevanthi
Meyer Thomas
Popescu-Belis Andrei
Publication date: 19/12/2013

Inter-sentential dependencies such as discourse connectives or pronouns have an impact on the translation of these items. These dependencies have classically been analyzed within complex theoretical frameworks, often monolingual ones, and the resulting fine-grained descriptions, although relevant to translation, are likely beyond the reach of statistical machine translation systems. Instead, we propose an approach to search for a minimal, feature-based characterization of translation divergences due to inter-sentential dependencies, in the case of discourse connectives and pronouns, based on contrastive analyses performed on the Europarl corpus. In addition, we show how to automatically assign labels to connectives and pronouns, and how to use them for statistical machine translation

Infoscience - École polytechnique fédérale de Lausanne